Off-line (and on-line) text analysis for computational lexicography
Abstract
Acknowledgements

The research reported on in this thesis was based on work in two projects. The first is the DFG-project Deutsches Referenzkorpus, a joint project of the Institut für Maschinelle Sprachverarbeitung (IMS) in Stuttgart, the Seminar für Sprachwissenschaft (SfS) in Tübingen, and the Institut für deutsche Sprache (IDS) in Mannheim. The second is the DFG-Transferbereich project Automatische Exzerption, which is intended to support the transfer of know-how from universities to companies; its project partners are the Institut für Maschinelle Sprachverarbeitung (IMS) in Stuttgart, the Dudenredaktion in Mannheim, and the Langenscheidt KG in München.

I would like to thank all the members of the IMS for providing an excellent and friendly working environment. In particular, I wish to thank my supervisors Christian Rohrer and Ulrich Heid for fruitful discussions, for their feedback and motivation, and for always being willing to lend me an ear. I am grateful to Esther König, who was responsible for the DEREKO-project, for her constant unofficial supervision. She encouraged me to go my own way without losing sight of the main issues. She was of great help in discussing conceptual as well as technical issues, thus putting my research on a solid foundation.

Special thanks go to Stefan Evert, my office-mate, who not only discussed work matters with me but also implemented and modified tools I used for my research. I could not have realized many of my ideas had he not implemented new features and thus opened up new possibilities. In addition, I want to thank Wolfgang Lezius for providing an import filter for TIGERSearch for my chunker. I also want to thank Stefanie Dipper, Sabine Schulte im Walde, Heike Zinsmeister, and Arne Fitschen for giving me insights into their work, Kristina Spranger for writing a Dutch version of the chunker and for her remarks on the grammar, and Piklu Gupta for revising my English.
Finally, I wish to thank my family and friends for the support, encouragement and distraction they provided me. And, last but not least, I want to thank Thomas for his support and for letting me forget about work.
Publication date: 2003